Browsing NTNU Open by author "Eeckhout, Lieven"

Balancing Performance Against Cost and Sustainability in Multi-Chip-Module GPUs

Zhang, Shiqing; Naderan-Tahan, Mahmood; Jahre, Magnus; Eeckhout, Lieven (Journal article; Peer reviewed, 2023)

MCM-GPUs scale performance by integrating multiple chiplets within the same package. How to partition the aggregate compute resources across chiplets poses a fundamental trade-off in performance versus cost and sustainability. ...

Characterizing Multi-Chip GPU Data Sharing

Zhang, Shiqing; Naderan-Tahan, Mahmood; Jahre, Magnus; Eeckhout, Lieven (Journal article; Peer reviewed, 2023)

Multi-chip Graphics Processing Unit (GPU) systems are critical to scale performance beyond a single GPU chip for a wide variety of important emerging applications. A key challenge for multi-chip GPUs, though, is how to ...

Delegated Replies: Alleviating Network Clogging in Heterogeneous Architectures

Zhao, Xia; Eeckhout, Lieven; Jahre, Magnus (Journal article; Peer reviewed, 2022)

Heterogeneous architectures with latency-sensitive CPU cores and bandwidth-intensive accelerators are attractive as they deliver high performance at favorable cost. These architectures typically have significantly more ...

GDP: Using Dataflow Properties to Accurately Estimate Interference-Free Performance at Runtime

Jahre, Magnus; Eeckhout, Lieven (Journal article; Peer reviewed, 2018)

Multi-core memory systems commonly share resources between processors. Resource sharing improves utilization at the cost of increased inter-application interference which may lead to priority inversion, missed deadlines ...

Get Out of the Valley: Power-Efficient Address Mapping for GPUs

Liu, Yuxi; Zhao, Xia; Jahre, Magnus; Wang, Zhenlin; Wang, Xiaolin; Luo, Yingwei; Eeckhout, Lieven (Journal article; Peer reviewed, 2018)

GPU memory systems adopt a multi-dimensional hardware structure to provide the bandwidth necessary to support 100s to 1000s of concurrent threads. On the software side, GPU-compute workloads also use multi-dimensional ...

HSM: A Hybrid Slowdown Model for Multitasking GPUs

Zhao, Xia; Jahre, Magnus; Eeckhout, Lieven (Chapter, 2020)

Graphics Processing Units (GPUs) are increasingly widely used in the cloud to accelerate compute-heavy tasks. However, GPU-compute applications stress the GPU architecture in different ways, leading to suboptimal resource ...

MDM: The GPU Memory Divergence Model

Wang, Lu; Jahre, Magnus; Adileh, Almutaz; Eeckhout, Lieven (Chapter, 2020)

Analytical models enable architects to carry out early-stage design space exploration several orders of magnitude faster than cycle-accurate simulation by capturing first-order performance phenomena with a set of mathematical ...

Modeling Emerging Memory-Divergent GPU Applications

Wang, Lu; Jahre, Magnus; Adileh, Almutaz; Wang, Zhiying; Eeckhout, Lieven (Journal article; Peer reviewed, 2019)

Analytical performance models yield valuable architectural insight without incurring the excessive runtime overheads of simulation. In this work, we study contemporary GPU applications and find that the key performance-related ...

NUBA: Non-Uniform Bandwidth GPUs

Zhao, Xia; Jahre, Magnus; Tang, Yuhua; Zhang, Guangda; Eeckhout, Lieven (Chapter, 2023)

The parallel execution model of GPUs enables scaling to hundreds of thousands of threads, which is a key capability that many modern high-performance applications exploit. GPU vendors are hence increasing the compute and ...

SAC: Sharing-Aware Caching in Multi-Chip GPUs

Zhang, Shiqing; Naderan-Tahan, Mahmood; Jahre, Magnus; Eeckhout, Lieven (Chapter, 2023)

Bandwidth non-uniformity in multi-chip GPUs poses a major design challenge for their last-level cache (LLC) architecture. Whereas a memory-side LLC caches data from the local memory partition while being accessible by all ...

Selective Replication in Memory-Side GPU Caches

Zhao, Xia; Jahre, Magnus; Eeckhout, Lieven (Chapter, 2020)

Data-intensive applications put immense strain on the memory systems of Graphics Processing Units (GPUs). To cater to this need, GPU memory systems distribute requests across independent units to provide high bandwidth by ...

TEA: Time-Proportional Event Analysis

Gottschall, Björn; Eeckhout, Lieven; Jahre, Magnus (Chapter, 2023)

As computer architectures become increasingly complex and heterogeneous, it becomes progressively more difficult to write applications that make good use of hardware resources. Performance analysis tools are hence critically ...